AITopics | positive observation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Hidden Cost of Waiting for Accurate Predictions

Shirali, Ali, Procaccia, Ariel, Abebe, Rediet

arXiv.org Artificial Intelligence, Mar-1-2025

Algorithmic predictions are increasingly informing societal resource allocations by identifying individuals for targeting. Policymakers often build these systems with the assumption that by gathering more observations on individuals, they can improve predictive accuracy and, consequently, allocation efficiency. An overlooked yet consequential aspect of prediction-driven allocations is that of timing. The planner has to trade off relying on earlier and potentially noisier predictions to intervene before individuals experience undesirable outcomes, or they may wait to gather more observations to make more precise allocations. We examine this tension using a simple mathematical model, where the planner collects observations on individuals to improve predictions over time. We analyze both the ranking induced by these predictions and optimal resource allocation. We show that though individual prediction accuracy improves over time, counter-intuitively, the average ranking loss can worsen. As a result, the planner's ability to improve social welfare can decline. We identify inequality as a driving factor behind this phenomenon. Our findings provide a nuanced perspective and challenge the conventional wisdom that it is preferable to wait for more accurate predictions to ensure the most efficient allocations.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.0065

Country:

North America > United States > California (0.14)
Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine (1.00)
Banking & Finance (0.92)
Government > Regional Government > North America Government > United States Government (0.92)
Energy > Oil & Gas > Upstream (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Augmented prediction of a true class for Positive Unlabeled data under selection bias

Mielniczuk, Jan, Wawrzeńczyk, Adam

arXiv.org Machine Learning, Jul-14-2024

We introduce a new observational setting for Positive Unlabeled (PU) data where the observations at prediction time are also labeled. This occurs commonly in practice -- we argue that the additional information is important for prediction, and call this task "augmented PU prediction". We allow labeling to be feature dependent. In such a scenario, the Bayes classifier and its risk are established and compared with the risk of a classifier which, for unlabeled data, is based only on predictors. We introduce several variants of the empirical Bayes rule in this scenario and investigate their performance. We emphasise the dangers (and ease) of applying a classical classification rule in the augmented PU scenario -- because there are no preexisting studies, an unaware researcher is prone to skewing the obtained predictions. We conclude that the variant based on a recently proposed variational autoencoder designed for the PU scenario performs on par with or better than the other considered variants and yields an advantage over feature-only methods in terms of accuracy for unlabeled samples.

dataset, prediction, scenario, (16 more...)

arXiv.org Machine Learning

2407.10309

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Verifying the Selected Completely at Random Assumption in Positive-Unlabeled Learning

Teisseyre, Paweł, Furmańczyk, Konrad, Mielniczuk, Jan

arXiv.org Machine Learning, Mar-29-2024

The goal of positive-unlabeled (PU) learning is to train a binary classifier on the basis of training data containing positive and unlabeled instances, where unlabeled observations can belong either to the positive class or to the negative class. Modeling PU data requires certain assumptions on the labeling mechanism that describes which positive observations are assigned a label. The simplest assumption, considered in early works, is SCAR (the Selected Completely at Random assumption), according to which the propensity score function, defined as the probability of assigning a label to a positive observation, is constant. On the other hand, a much more realistic assumption is SAR (Selected at Random), which states that the propensity function depends solely on the observed feature vector. SCAR-based algorithms are much simpler and computationally much faster than SAR-based algorithms, which usually require challenging estimation of the propensity score. In this work, we propose a relatively simple and computationally fast test that can be used to determine whether the observed data meet the SCAR assumption. Our test is based on generating artificial labels conforming to the SCAR case, which in turn allows us to mimic the distribution of the test statistic under the null hypothesis of SCAR. We justify our method theoretically. In experiments, we demonstrate that the test successfully detects various deviations from the SCAR scenario while effectively controlling the type I error. The proposed test can be recommended as a pre-processing step to decide which final PU algorithm to choose in cases when the nature of the labeling mechanism is not known.
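To make the SCAR/SAR distinction concrete, here is a minimal simulation sketch. It is not the test proposed in the paper: the one-dimensional Gaussian features, the constant `c = 0.3`, and the logistic form of the SAR propensity are all illustrative assumptions. It only shows that under a constant propensity the labeled positives look like all positives, while a feature-dependent propensity shifts their distribution.

```python
import math
import random


def scar_propensity(x, c=0.3):
    """SCAR: the chance of labeling a positive is constant (c is an assumed value)."""
    return c


def sar_propensity(x):
    """SAR: the chance of labeling depends on x (hypothetical logistic form)."""
    return 1.0 / (1.0 + math.exp(-x))


def label_positives(xs, ys, propensity, rng):
    """s_i = 1 only if instance i is positive AND was selected for labeling."""
    return [1 if y == 1 and rng.random() < propensity(x) else 0
            for x, y in zip(xs, ys)]


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)
n = 20000
# True classes (hidden in real PU data) and one Gaussian feature per instance.
ys = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
xs = [rng.gauss(1.0 if y == 1 else -1.0, 1.0) for y in ys]

s_scar = label_positives(xs, ys, scar_propensity, rng)
s_sar = label_positives(xs, ys, sar_propensity, rng)

mean_all_pos = mean([x for x, y in zip(xs, ys) if y == 1])
mean_scar = mean([x for x, s in zip(xs, s_scar) if s == 1])
mean_sar = mean([x for x, s in zip(xs, s_sar) if s == 1])

print(f"all positives:        {mean_all_pos:.3f}")
print(f"labeled under SCAR:   {mean_scar:.3f}")
print(f"labeled under SAR:    {mean_sar:.3f}")
```

Under these assumptions the SCAR mean should sit close to the mean of all positives, while the SAR mean is pulled toward larger feature values, which is the kind of deviation a SCAR test is meant to detect.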

assumption, probability, proceedings, (15 more...)

arXiv.org Machine Learning

2404.00145

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

Single-sample versus case-control sampling scheme for Positive Unlabeled data: the story of two scenarios

Mielniczuk, Jan, Wawrzeńczyk, Adam

arXiv.org Artificial Intelligence, Dec-4-2023

In this paper we argue that the performance of classifiers based on Empirical Risk Minimization (ERM) for positive unlabeled data, which are designed for the case-control sampling scheme, may significantly deteriorate when applied to a single-sample scenario. We reveal why their behavior depends, in all but very specific cases, on the scenario. We also introduce a single-sample analogue of the popular non-negative risk classifier designed for case-control data and compare its performance with the original proposal. We show that significant differences occur between them, especially when half or more of the positive observations are labeled. The opposite case, in which an ERM minimizer designed for the case-control scenario is applied to single-sample data, is also considered, and similar conclusions are drawn. Taking the difference between scenarios into account requires a sole, but crucial, change in the definition of the Empirical Risk.
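For orientation, the case-control non-negative risk that the abstract refers to can be sketched as below. This is the standard clipped PU risk estimator under an assumed known class prior; the 0-1 loss, the function names, and the toy scores are illustrative choices, and the single-sample analogue introduced in the paper is not reproduced here.

```python
def zero_one_loss(score, target):
    """0-1 loss for a real-valued score against a target in {+1, -1}."""
    predicted = 1 if score > 0 else -1
    return 0.0 if predicted == target else 1.0


def mean(values):
    return sum(values) / len(values)


def nn_pu_risk(scores_pos, scores_unl, prior, loss=zero_one_loss):
    """Non-negative PU risk for case-control data.

    scores_pos: classifier scores on the labeled positive sample
    scores_unl: classifier scores on the unlabeled sample
    prior:      assumed class prior P(y = +1)
    """
    risk_pos_as_pos = mean([loss(s, +1) for s in scores_pos])
    risk_pos_as_neg = mean([loss(s, -1) for s in scores_pos])
    risk_unl_as_neg = mean([loss(s, -1) for s in scores_unl])
    # The negative-class term is estimated indirectly and clipped at zero,
    # because the unclipped difference can become negative on finite samples.
    negative_part = max(0.0, risk_unl_as_neg - prior * risk_pos_as_neg)
    return prior * risk_pos_as_pos + negative_part


# Toy scores (made up for illustration only).
scores_pos = [2.0, 1.0, -0.5, 3.0]
scores_unl = [1.5, -1.0, -2.0, 0.5, -0.3]

risk_low_prior = nn_pu_risk(scores_pos, scores_unl, prior=0.5)
risk_high_prior = nn_pu_risk(scores_pos, scores_unl, prior=0.8)
print(risk_low_prior, risk_high_prior)
```

With the larger prior the unclipped negative-class term would be below zero, so the clipping is what keeps the estimate from collapsing to zero in the second call.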

single-sample versus case-control, scenario, (16 more...)

arXiv.org Artificial Intelligence

2312.02095

Country:

North America > United States > California (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.61)

ORGAN: Observation-Guided Radiology Report Generation via Tree Reasoning

Hou, Wenjun, Xu, Kaishuai, Cheng, Yi, Li, Wenjie, Liu, Jiang

arXiv.org Artificial Intelligence, Jun-10-2023

This paper explores the task of radiology report generation, which aims at generating free-text descriptions for a set of radiographs. One significant challenge of this task is how to correctly maintain the consistency between the images and the lengthy report. Previous research explored solving this issue through planning-based methods, which generate reports based only on high-level plans. However, these plans usually contain only the major observations from the radiographs (e.g., lung opacity), lacking much necessary information, such as the observation characteristics and preliminary clinical diagnoses. To address this problem, the system should also take the image information into account together with the textual plan and perform stronger reasoning during the generation process. In this paper, we propose an observation-guided radiology report generation framework (ORGAN). It first produces an observation plan and then feeds both the plan and radiographs for report generation, where an observation graph and a tree reasoning mechanism are adopted to precisely enrich the plan information by capturing the multi-formats of each observation. Experimental results demonstrate that our framework outperforms previous state-of-the-art methods regarding text quality and clinical efficacy.

computational linguistic, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2306.06466

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(12 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.69)

Convolutional Neural Network for Breast Cancer Classification

#artificialintelligence, Jan-31-2023, 10:05:59 GMT

Breast cancer is the second most common cancer in women and men worldwide. In 2012, it represented about 12 percent of all new cancer cases and 25 percent of all cancers in women. Breast cancer starts when cells in the breast begin to grow out of control. These cells usually form a tumor that can often be seen on an x-ray or felt as a lump. The tumor is malignant (cancer) if the cells can grow into (invade) surrounding tissues or spread (metastasize) to distant areas of the body.

artificial intelligence, batch size, machine learning, (17 more...)

#artificialintelligence

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Deep learning with multi modalities phenotypes and biomarkers

#artificialintelligence, Mar-1-2022, 22:20:05 GMT

There is growing interest in the biomedical world in utilizing multi-modal, multi-featured machine learning applications to create models that can predict disease development. Identifying vulnerability to the development of health problems opens up important prevention options, including treatments and lifestyle changes. Working with multi-modality data requires additional steps and preparation to make sure that the combined modalities don't skew the results. In the current example, we used a dataset that includes a combination of demographics, clinical diagnosis, genetics, and biomarker features. We used supervised deep learning with Python/Keras to create a model for identifying individuals with vulnerability to develop major depression.

dataset, multi modality phenotype and biomarker, positive observation, (11 more...)

#artificialintelligence

Genre: Research Report (0.32)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.77)

Quantifying With Only Positive Training Data

Reis, Denis dos, de Souto, Marcílio, de Sousa, Elaine, Batista, Gustavo

arXiv.org Machine Learning, Oct-12-2021

Quantification is the research field that studies methods for counting the number of data points that belong to each class in an unlabeled sample. Traditionally, researchers in this field assume the availability of labelled observations for all classes to induce a quantification model. However, we often face situations where the number of classes is large or even unknown, or we have reliable data for a single class. When inducing a multi-class quantifier is infeasible, we are often concerned with estimates for a specific class of interest. In this context, we have proposed a novel setting known as One-class Quantification (OCQ). In contrast, Positive and Unlabeled Learning (PUL), another branch of Machine Learning, has offered solutions to OCQ, despite quantification not being the focal point of PUL. This article closes the gap between PUL and OCQ and brings both areas together under a unified view. We compare our method, Passive Aggressive Threshold (PAT), against PUL methods and show that PAT generally is the fastest and most accurate algorithm. PAT induces quantification models that can be reused to quantify different samples of data. We additionally introduce Exhaustive TIcE (ExTIcE), an improved version of the PUL algorithm Tree Induction for c Estimation (TIcE). We show that ExTIcE quantifies more accurately than PAT and the other assessed algorithms in scenarios where several negative observations are identical to the positive ones.
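As a rough illustration of one-class quantification, the sketch below estimates the proportion of positives in an unlabeled sample using only scores from positive training data. It is a simple threshold-and-rescale heuristic written for this listing, not the authors' PAT or ExTIcE implementation; the function name, the quantile `q`, the toy scores, and the assumption that negatives score below the threshold are all hypothetical.

```python
def estimate_positive_proportion(pos_train_scores, unlabeled_scores, q=0.5):
    """Estimate the share of positives in an unlabeled sample.

    pos_train_scores: one-class scores of known positives (higher = more positive)
    unlabeled_scores: scores of the sample to quantify
    q:                quantile of the positive scores used as threshold, 0 <= q < 1
    """
    ranked = sorted(pos_train_scores)
    threshold = ranked[int(q * len(ranked))]
    # Fraction of known positives that clear the threshold (always > 0).
    frac_pos_above = sum(s >= threshold for s in ranked) / len(ranked)
    # Count unlabeled points above the threshold, assuming negatives stay below it,
    # then rescale to account for positives that fall under the threshold.
    n_above = sum(s >= threshold for s in unlabeled_scores)
    estimated_positives = n_above / frac_pos_above
    return min(1.0, estimated_positives / len(unlabeled_scores))


# Toy scores (made up for illustration only).
pos_train_scores = [0.6, 0.7, 0.8, 0.9]
unlabeled_scores = [0.1, 0.2, 0.85, 0.95, 0.3, 0.65, 0.9, 0.15]

proportion = estimate_positive_proportion(pos_train_scores, unlabeled_scores)
print(proportion)
```

Because the threshold is learned from the positive class alone, the same fitted threshold can be applied to any number of new unlabeled samples, which mirrors the reuse property the abstract mentions.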

algorithm, experiment, positive observation, (13 more...)

arXiv.org Machine Learning

2004.10356

Country:

North America > United States (1.00)
South America > Brazil > São Paulo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre:

Overview (0.92)
Research Report > New Finding (0.68)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

30 Most Asked Machine Learning Questions Answered - KDnuggets

#artificialintelligence, Aug-25-2021, 01:16:06 GMT

Machine Learning is the path to a better and more advanced future. Machine Learning Developer is the most in-demand job in 2021, and demand is going to increase by 20–30% in the upcoming 3–5 years. Machine Learning at its core is all statistics and programming concepts. The language most used by machine learning developers is Python because of its simplicity. In this blog, you will find some of the most asked machine learning questions that every machine learning enthusiast has to answer one day. Ans: Machine learning is the science of getting computers to act in a real-time situation without being explicitly programmed.

algorithm, dataset, prediction, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

30 Basic Machine Learning Questions Answered

#artificialintelligence, Jun-20-2021, 17:21:00 GMT

Machine Learning is the path to a better and more advanced future. Machine Learning Developer is the most in-demand job in 2021, and demand is going to increase by 20–30% in the upcoming 3–5 years. Machine Learning at its core is all statistics and programming concepts. The language most used by machine learning developers is Python because of its simplicity. In this blog, you will find some of the most asked machine learning questions that every machine learning enthusiast has to answer one day.

algorithm, dataset, prediction, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
